A Novel Prioritization Technique for Solving Markov Decision Processes
Abstract
We address the problem of computing an optimal value function for Markov decision processes (MDPs). Since finding this function quickly and accurately requires substantial computational effort, techniques that accelerate fundamental algorithms have been a main focus of research. Among them, prioritization solvers address the problem of ordering backup operations. Prioritizing the sequence of backups reduces the number of backups needed considerably, but incurs significant overhead. This paper provides a new way to order backups, based on a mapping of the state space into a metric space. Empirical evaluation verifies that our method achieves the best balance between the number of backups executed and the effort required to prioritize them, showing an order-of-magnitude improvement in runtime on a number of benchmarks.
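The general idea of prioritized backups can be illustrated with a minimal sketch in the spirit of prioritized sweeping: states are kept in a priority queue keyed by their Bellman residual, the state with the largest residual is backed up next, and its predecessors are reprioritized. The toy MDP, predecessor map, and priority rule below are illustrative assumptions, not the paper's metric-space method.

```python
import heapq

# Toy MDP: state -> {action: [(next_state, prob, reward), ...]}.
# These transitions are illustrative, not taken from the paper.
GAMMA = 0.95
P = {
    0: {"a": [(1, 1.0, 0.0)], "b": [(2, 1.0, 1.0)]},
    1: {"a": [(2, 1.0, 2.0)]},
    2: {"a": [(2, 1.0, 0.0)]},  # absorbing state
}
# Predecessors: which states can reach s in one step.
PRED = {0: set(), 1: {0}, 2: {0, 1, 2}}

def backup(V, s):
    """One Bellman backup: V(s) = max_a sum_t p * (r + gamma * V(t))."""
    return max(
        sum(p * (r + GAMMA * V[t]) for t, p, r in outcomes)
        for outcomes in P[s].values()
    )

def prioritized_vi(eps=1e-6):
    """Value iteration ordered by Bellman residual (prioritized sweeping)."""
    V = {s: 0.0 for s in P}
    # heapq is a min-heap, so store negated residuals.
    heap = [(-abs(backup(V, s) - V[s]), s) for s in P]
    heapq.heapify(heap)
    while heap:
        _, s = heapq.heappop(heap)
        res = abs(backup(V, s) - V[s])
        if res < eps:  # stale or already-converged entry
            continue
        V[s] = backup(V, s)
        # Reprioritize predecessors whose residual may have grown.
        for p in PRED[s]:
            p_res = abs(backup(V, p) - V[p])
            if p_res > eps:
                heapq.heappush(heap, (-p_res, p))
    return V
```

Compared with plain value iteration, which sweeps all states each round, this ordering concentrates work where values are still changing; the overhead of maintaining the queue is exactly the cost the paper's technique aims to reduce.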
Similar Resources
Accelerated decomposition techniques for large discounted Markov decision processes
Many hierarchical techniques for solving large Markov decision processes (MDPs) are based on partitioning the state space into strongly connected components (SCCs) that can be grouped into levels. At each level, smaller problems, called restricted MDPs, are solved, and these partial solutions are then combined to obtain the global solution. In this paper, we first propose a novel algorith...
Topological Orders Based Planning for Solving POMDPs
Although partially observable Markov decision processes (POMDPs) have received significant attention in past years, to date, solving problems of realistic order of magnitude remains a serious challenge. In this context, techniques that accelerate fundamental algorithms have been a main focus of research. Among them prioritized solvers suggest solutions to the problem of ordering backup operatio...
Scaling Up: Solving POMDPs through Value Based Clustering
Partially Observable Markov Decision Processes (POMDPs) provide an appropriately rich model for agents operating under partial knowledge of the environment. Since finding an optimal POMDP policy is intractable, approximation techniques have been a main focus of research, among them point-based algorithms, which scale relatively well up to thousands of states. An important decision in a point...
Symbolic LAO* Search for Factored Markov Decision Processes
We describe a planning algorithm that integrates two approaches to solving Markov decision processes with large state spaces. It uses state abstraction to avoid evaluating states individually. And it uses forward search from a start state, guided by an admissible heuristic, to avoid evaluating all states. These approaches are combined in a novel way that exploits symbolic model-checking techniq...
Safe Q-Learning on Complete History Spaces
In this article, we present an idea for solving deterministic partially observable Markov decision processes (POMDPs) based on a history space containing sequences of past observations and actions. A novel and sound technique for learning a Q-function on history spaces is developed and discussed. We analyze certain conditions under which a history based approach is able to learn policies compar...